%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
README File for “Can quotas increase the supply of candidates for higher-level positions? Evidence from local government in India” by Stephen D. O'Connell
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

In this file, I describe how to replicate the analysis from raw data. Where raw data files could not be included in the archive, or are already publicly available, I provide instructions on how to acquire them.

I used Stata 13/14 on Mac OS X to perform the analysis.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Data
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
The analysis relies on four core data sources. These are:

1) District chairperson panel data, 1995 to 2007. Available from the replication backup of Lakshmi Iyer, Anandi Mani, Prachi Mishra, and Petia Topalova, “The Power of Political Voice: Women’s Political Representation and Crime in India,” American Economic Journal: Applied Economics (AEJApp_2011_0220), which can be downloaded for free from the publisher's website.

2) State legislative assembly elections data:
1961 to 2007: acquired from Francesca Jensenius upon request.
Supplementary AC election turnout data were acquired from Yogesh Uppal upon request.

3) Parliament (Lok Sabha) elections data: publicly available online from various sources. The data used in the analysis and included in this replication archive were downloaded in raw form via query at www.empoweringindia.org (now defunct).

4) Geospatial data on administrative boundaries (districts) and parliamentary and assembly constituencies. District boundary shapefiles are available from Global Administrative Areas (https://gadm.org/, last accessed 4 October 2018). Pre-2007 Assembly constituency geospatial data are available from Sandip Sukhtankar's website (http://www.dartmouth.edu/~sandip/data.html, last accessed 4 October 2018). Parliamentary constituency shapefiles are available for download for free from https://github.com/datameet/maps/tree/master/parliamentary-constituencies (last accessed 4 October 2018).


Secondary datasets used in the analysis come from:
1) 1991 Population Census of India: district totals, available for free via download: http://censusindia.gov.in/DigitalLibrary/TableSeries.aspx
2) Gridded Population of the World, available for free via download at http://sedac.ciesin.columbia.edu/data/collection/gpw-v4
3) Data from background/web research on parliamentary candidates, provided with replication files


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Programs/scripts:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Stata do-files are provided in two folders:

"2progs" contains data preparation scripts that read, clean, and merge the datasets for analysis. Users must acquire the data sources above and place them in the appropriate subfolder within "/1data/". The scripts also rely on two intersected map files, provided with the replication backup in the "/9maps" folder, which form the basis for the analysis datasets. Each file is an export of the attribute table of an intersected polygon layer (district boundaries intersected with either parliamentary constituencies or assembly constituencies).

This folder contains 5 scripts, which are intended to be run in order:

1) 1_prep_assembly_elections_data.do: imports and cleans assembly elections data.
2) 2_prep_parliamentary_data.do: imports, cleans, and links election records across parliamentary elections.
3) 3_Match.SA.Across.Elections.do: cleans and links election records across state assembly (AC) elections and aggregates candidate counts by constituency-election year.
4) 4_candidates_sumstats_prep_candidacy_summary.do: aggregates candidate counts by constituency-election year for parliamentary elections.
5) 5_read_GPW_data.do: reads and cleans the "Gridded Population of the World" dataset (used for alternative weightings).

This process results in two files ("PC_analysis_dataset" and "AC_analysis_dataset"), which are provided with the replication backup and used as the basis for the analysis.
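
Before running the data preparation step, it can help to confirm that all five scripts are present and to see them in their intended order. The following is a minimal Python sketch, not part of the replication archive. The script names are copied from the list above; the function names and the assumption that "2progs" sits directly under the archive root are illustrative only.

```python
from pathlib import Path

# Script names copied from the README list, in the intended run order.
PREP_SCRIPTS = [
    "1_prep_assembly_elections_data.do",
    "2_prep_parliamentary_data.do",
    "3_Match.SA.Across.Elections.do",
    "4_candidates_sumstats_prep_candidacy_summary.do",
    "5_read_GPW_data.do",
]


def run_order(archive_root, folder="2progs"):
    """Return the full paths of the prep scripts in run order.

    Assumption: `folder` sits directly under `archive_root`.
    """
    base = Path(archive_root) / folder
    return [base / name for name in PREP_SCRIPTS]


def missing_prep_scripts(archive_root, folder="2progs"):
    """Return the names of prep scripts (in run order) that are not on disk."""
    return [p.name for p in run_order(archive_root, folder) if not p.is_file()]
```

Calling missing_prep_scripts(".") from the archive root returns an empty list when every script is in place; each path from run_order(".") would then be run in Stata in that sequence.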


Folder "3tablesfigures" contains 8 scripts used to produce the inputs to all tables and figures in the manuscript.

1) 1_Figure1_Map.do: takes state and district shapefile data and merges them with the chairperson data to produce Figure 1.
2) 2a_Figure2dataprep.do: formats data for Figure 2.
3) 2a_Figure2.R: produces Figure 2.
4) 3_Figure3.do: produces the panels of Figure 3.
5) 4a_Prep_PCs_Panel: final data merging to produce the analysis dataset (parliamentary election outcomes).
6) 4b_Prep_ACs_Panel: final data merging to produce the analysis dataset (state legislative assembly outcomes).
7) 5a_FinalTables_PCs_Panel: produces the inputs for all manuscript tables (parliamentary election outcomes).
8) 5b_FinalTables_ACs_Panel: produces the inputs for all manuscript tables (state legislative assembly outcomes).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Primary variables used in Analysis:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%
AC_analysis_dataset.dta - variables used in analysis:
%%%%%%%%%%%%%%%%
female_candidate - number of female candidates running in the assembly constituency
male_cand - number of male candidates running in the assembly constituency
cum_distres - cumulative years of chairperson reservation as of the year of the election
state - state name
AC - constituency name
district - district name
constituency_pop_weight - share of the constituency population (based on gridded population estimates) in the district-constituency subpart overlap

rural_women_literacy - district rural women's literacy rate based on 1991 Population Census
rural_women_midsch_rate - district rural women's middle school completion rate based on 1991 Population Census
rural_SCST - district rural SC/ST population share based on 1991 Population Census
rural_sexratio - district rural sex ratio based on 1991 Population Census
female_share_cand_prepolicy - female share of candidates in the constituency's early-1990s election


female_share_candidates - female share of candidates in assembly constituency election
female_votes - share of votes going to female candidates in assembly constituency election
female_winner - indicator for whether a female candidate won the assembly constituency election
female_finish_top5 - indicator for whether a female candidate finished among the top 5 finishers in assembly constituency election
female_finish_top30th - indicator for whether a female candidate finished in the top 30% of finishers in assembly constituency election


turnout - voter turnout in assembly constituency election
female_share_voters_AC - female share of voters in assembly constituency election
fturnout - female voter turnout in assembly constituency election
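
To make the definition of constituency_pop_weight concrete, the following Python sketch (not part of the replication archive) computes, for each district-constituency overlap, the share of the constituency's population that falls inside it. The function name, the input layout, and the population figures in the usage note are hypothetical. The sketch also assumes the weights are normalised within each constituency, which is one reading of the description above.

```python
from collections import defaultdict


def constituency_pop_weights(overlap_pop):
    """Compute population shares for district-constituency overlaps.

    overlap_pop maps (constituency, district) -> population in that overlap,
    e.g. as summed from gridded population estimates (hypothetical input).
    Returns (constituency, district) -> share of the constituency's
    population located in that overlap.
    """
    # Total population of each constituency across all its district overlaps.
    totals = defaultdict(float)
    for (constituency, _district), pop in overlap_pop.items():
        totals[constituency] += pop

    weights = {}
    for (constituency, district), pop in overlap_pop.items():
        total = totals[constituency]
        # An unpopulated constituency has no defined share.
        weights[(constituency, district)] = pop / total if total > 0 else float("nan")
    return weights
```

For example, with made-up populations {("PC_A", "D1"): 600, ("PC_A", "D2"): 400, ("PC_B", "D2"): 250}, constituency PC_A is split 0.6 / 0.4 between districts D1 and D2, while PC_B lies entirely within D2 and gets a weight of 1.0.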


%%%%%%%%%%%%%%%%
PC_analysis_dataset.dta - variables used in analysis
%%%%%%%%%%%%%%%%
female_candidate - number of female candidates running in the parliamentary constituency
male_cand - number of male candidates running in the parliamentary constituency
cum_distres_2007 - cumulative years of chairperson reservation as of the year of the election
state - state name
constituency - constituency name
district - district name
constituency_pop_weight - share of the constituency population (based on gridded population estimates) in the district-constituency subpart overlap

rural_women_literacy - district rural women's literacy rate based on 1991 Population Census
rural_women_midsch_rate - district rural women's middle school completion rate based on 1991 Population Census
rural_SCST - district rural SC/ST population share based on 1991 Population Census
rural_sexratio - district rural sex ratio based on 1991 Population Census
female_share_candidates_1991 - female share of candidates in the constituency's early-1990s election


female_share_candidates - female share of candidates in parliamentary constituency election
female_votes - share of votes going to female candidates in parliamentary constituency election
female_winner - indicator for whether a female candidate won the parliamentary constituency election
female_finish_top5 - indicator for whether a female candidate finished among the top 5 finishers in parliamentary constituency election
female_finish_top30th - indicator for whether a female candidate finished in the top 30% of finishers in parliamentary constituency election


female_candidate_majorparty - count of female candidates from a major party
female_candidate_minorparty - count of female candidates from a minor party
female_candidate_independent - count of female independent candidates
majorparty_votes - share of votes to a major party

incumbent_run - whether incumbent ran in election
incumbent_majorparty - whether the incumbent running was from a major party

voter_turnout - voter turnout in parliamentary constituency election
female_share_voters - female share of voters in parliamentary constituency election
female_voter_turnout - female voter turnout in parliamentary constituency election
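
As an illustration of how the candidate-level outcome variables relate to one another, the Python sketch below derives them from the candidate list of a single constituency-election. It is not the code used in the do-files. The function name and input format are hypothetical, and the exact rules (ranking by votes, treating "top 30%" as rank divided by number of candidates being at most 0.30, ignoring ties) are assumptions based only on the descriptions above.

```python
def female_outcomes(candidates):
    """Derive README-style outcome variables for one constituency-election.

    candidates: list of (sex, votes) tuples, with sex "F" or "M"
    (hypothetical input format).
    """
    n = len(candidates)
    if n == 0:
        raise ValueError("at least one candidate is required")

    # Rank candidates by votes, highest first (ties are not handled specially).
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    total_votes = sum(votes for _sex, votes in ranked)
    female_ranks = [i + 1 for i, (sex, _v) in enumerate(ranked) if sex == "F"]
    best = min(female_ranks) if female_ranks else None
    female_vote_total = sum(votes for sex, votes in ranked if sex == "F")

    return {
        "female_candidate": len(female_ranks),
        "male_cand": sum(1 for sex, _v in ranked if sex == "M"),
        "female_share_candidates": len(female_ranks) / n,
        "female_votes": female_vote_total / total_votes if total_votes else float("nan"),
        "female_winner": int(best == 1),
        "female_finish_top5": int(best is not None and best <= 5),
        # Assumption: "top 30%" means rank / number of candidates <= 0.30.
        "female_finish_top30th": int(best is not None and best / n <= 0.30),
    }
```

With the made-up input [("M", 5000), ("F", 3000), ("M", 1500), ("F", 500)], the best-placed woman finishes second of four, so female_share_candidates is 0.5, female_votes is 0.35, female_winner is 0, female_finish_top5 is 1, and female_finish_top30th is 0.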








